Improved compressed indexes for full-text document retrieval


Similar resources

Improved Compressed Indexes for Full-Text Document Retrieval

We give new space/time tradeoffs for compressed indexes that answer document retrieval queries on general sequences. On a collection of D documents of total length n, current approaches require at least |CSA| + O(n lg D / lg lg D) or 2|CSA| + o(n) bits of space, where CSA is a full-text index. Using monotone minimal perfect hash functions, we give new algorithms for document listing with frequenci...


Improved Grammar-Based Compressed Indexes

We introduce the first grammar-compressed representation of a sequence that supports searches in time that depends only logarithmically on the size of the grammar. Given a text T[1..u] that is represented by a (context-free) grammar of n (terminal and nonterminal) symbols and size N (measured as the sum of the lengths of the right-hand sides of the rules), a basic grammar-based representation of T ...


Practical Compressed Document Retrieval

Recent research on document retrieval for general texts has established the virtues of explicitly representing the so-called document array, which stores the document each pointer of the suffix array belongs to. While it makes document retrieval faster, this array occupies a significant amount of redundant space and is not easily compressible. In this paper we present the first practical prop...


MeSH Up: effective MeSH text classification for improved document retrieval

Motivation: Controlled vocabularies such as the Medical Subject Headings (MeSH) thesaurus and the Gene Ontology (GO) provide an efficient way of accessing and organizing biomedical information by reducing the ambiguity inherent to free-text data. Different methods of automating the assignment of MeSH concepts have been proposed to replace manual annotation, but they are either limited to a small...



Journal

Journal title: Journal of Discrete Algorithms

Year: 2013

ISSN: 1570-8667

DOI: 10.1016/j.jda.2012.07.005